JT's IBM Data Science Capstone Assignment:

Red and Blue Gerrymandering

What determines (alleged) Georgia gerrymandering?

(image courtesy of economicmodeling.com)

Introduction (Business Problem Summary):

Amongst the issues affecting election campaigning, gerrymandering is one of the most vexing and arcane. Given the contentious nature of this topic in Georgia (USA) and the close results of recent elections, it is important that political campaign management firms (as well as voters) understand its effects in order to better target candidate marketing strategies during the next Georgia elections. Demographic analysis of the Georgia districts, most of which have undergone gerrymandered redistricting, will be important to their success.

Note that the target audience for this analysis is political campaign management firms (often hired by congressional candidates) that do business in Georgia. Such firms typically control investment strategies for the candidate, including the types of adverts to place, the locations to focus on, and the delivery channels for important candidate communication. These strategies are aimed at achieving the highest voter turnout for their candidate; demographic analysis is therefore vital to the success and reputation of the campaign management firm (as well as the candidate).

Gerrymandering is a practice intended to establish an unfair political advantage for a particular party or group by manipulating election district boundaries. "Districts" define geographical boundaries, with each district within a state being geographically contiguous and having about the same population.

In recent years, districting policies in Georgia, USA have been hotly debated, particularly during the 2018 gubernatorial election between Stacey Abrams and current Governor Brian Kemp. Accusations of a "rigged process" were rife, as redistricting often resulted in varied and interesting "geographically contiguous" shapes:

SOUTH%20CAROLINA.png

Greenville.png

Given that personal political ideologies have shifted over time in given locations, understanding this phenomenon is essential. We will evaluate the demographics of Georgia, contrasting "Red" districts (Republican) with "Blue" districts (Democratic) to see what characterizes each type. These observations will inform the investments that campaign firms should consider to combat the negative effect of gerrymandering on candidate success.

Caveats: Please note that this is exploratory analysis (in the loosest sense of the word); my results and observations could mislead at a time when accurate information ("truth") is under stress. Moreover, to conduct such analysis properly, I would need access to more data (e.g., cuts of information by year before and after redistricting, and more granular income distribution and education reporting); such data is currently not freely available.

-----END OF INTRODUCTION SECTION-----

START OF DATA SECTION:

Method and Data Requirements:

I will review certain characteristics of "red" (Republican) and "blue" (Democrat) districts:

  • population voting history
  • education
  • age
  • local amenities (venue categories)
  • poverty level
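To make the method concrete, here is a minimal sketch of how district-level features like those listed above could be grouped with k-means. It is an illustration under stated assumptions, not this notebook's actual pipeline: the district labels and feature values are invented placeholders (not real Georgia data), the clustering is a plain-Python version of Lloyd's algorithm written so the example has no dependencies, and k=2 is chosen only to mirror the red/blue contrast. In a notebook, a library implementation such as scikit-learn's `KMeans` would be the more usual choice.

```python
import random
from statistics import mean, pstdev

# Placeholder rows, NOT real data. Assumed column order:
# [turnout rate, share with bachelor's degree, share aged 65+, poverty rate]
districts = ["A", "B", "C", "D", "E", "F"]  # hypothetical district labels
features = [
    [0.62, 0.45, 0.14, 0.08],
    [0.60, 0.42, 0.15, 0.09],
    [0.64, 0.47, 0.13, 0.07],
    [0.45, 0.20, 0.19, 0.21],
    [0.47, 0.22, 0.18, 0.20],
    [0.44, 0.19, 0.20, 0.22],
]


def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm: assign to nearest centroid, then recompute."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels, centroids


labels, centroids = kmeans(standardize(features), k=2)
for name, lab in zip(districts, labels):
    print(f"district {name} -> cluster {lab}")
```

Standardizing first matters because the features sit on different scales; without it, whichever feature has the widest numeric spread would dominate the distance calculation. The cluster numbers themselves are arbitrary, so only the groupings carry meaning.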

These features will be fed into a k-means clustering process, which will produce groupings that can be evaluated for strategic review and investment. The features will be sourced from the following (samples included):

https://ballotpedia.org/Redistricting_in_Georgia - congressional districts by number, current representative by full name, and current party affiliation, as well as term, election victory margins, and district ethnic demographics. The information is spread across several tables on this one webpage. Samples: image.png image.png

https://www2.census.gov/programs-surveys/demo/tables/voting/table01.xlsx - "Number of Votes Cast, Citizen Voting-Age Population and Voting Rates for Congressional Districts: 2018" Sample: image.png

https://www2.census.gov/programs-surveys/demo/tables/voting/table02a.xlsx - "Characteristics (Age) of the Citizen Voting-Age Population for Congressional Districts: 2018" Sample: image.png

https://www2.census.gov/programs-surveys/demo/tables/voting/table02c.xlsx - "Characteristics (Educational Attainment) of the Citizen Voting-Age Population for Congressional Districts: 2018" Sample: image.png

https://www2.census.gov/programs-surveys/demo/tables/voting/table02b.xlsx - "Characteristics (Sex and Poverty) of the Citizen Voting-Age Population for Congressional Districts: 2018" Sample: image.png
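The four Census spreadsheets above share a base URL, so they could be pulled into a notebook along the following lines. This is a hedged sketch rather than the loading code actually used: the short keys (`turnout`, `age`, and so on) and the helper names are my own, and `header_rows` is a placeholder, since the number of title rows above the real column headers has to be checked by opening each file. It assumes pandas (plus an Excel engine such as openpyxl) is installed.

```python
CENSUS_BASE = "https://www2.census.gov/programs-surveys/demo/tables/voting"

# Short keys are illustrative names, not official table identifiers.
CENSUS_TABLES = {
    "turnout": "table01.xlsx",       # votes cast, voting-age population, voting rates
    "age": "table02a.xlsx",          # age characteristics
    "sex_poverty": "table02b.xlsx",  # sex and poverty characteristics
    "education": "table02c.xlsx",    # educational attainment
}


def table_url(key):
    """Build the download URL for one of the Census tables listed above."""
    return f"{CENSUS_BASE}/{CENSUS_TABLES[key]}"


def load_table(key, header_rows=0):
    """Read one table into a DataFrame.

    header_rows is an assumption: set it to the number of title rows that
    precede the real header once the spreadsheet layout has been inspected.
    """
    import pandas as pd  # imported lazily so the URL helpers need no third-party package

    return pd.read_excel(table_url(key), skiprows=header_rows)


for key in CENSUS_TABLES:
    print(key, "->", table_url(key))
```

Calling `load_table("age")` would then download and parse that spreadsheet; filtering the result down to Georgia's districts depends on the column layout, which is not assumed here.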

https://developer.foursquare.com/docs/build-with-foursquare/categories - Foursquare data for venue categories, with locations geocoded via geopy (if it cooperates). Sample: image.png
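Because geocoding services can time out or return nothing, the location lookup is worth wrapping defensively. The sketch below is one possible approach, not the code actually run here: it assumes geopy's `Nominatim` geocoder wrapped in geopy's `RateLimiter`, and the function names, the `user_agent` string, and the place names in the demo are hypothetical. The lookup function takes the geocoder as an argument so that it can be exercised offline with a stand-in; the Foursquare venue query that would consume the coordinates is deliberately not shown.

```python
from types import SimpleNamespace


def make_nominatim_geocoder(user_agent="ga-districts-capstone"):
    """Return a rate-limited geocode callable (needs geopy and network access).

    The user_agent value is a made-up example; Nominatim asks for one that
    identifies your application.
    """
    from geopy.geocoders import Nominatim
    from geopy.extra.rate_limiter import RateLimiter

    geolocator = Nominatim(user_agent=user_agent)
    # Wait at least one second between calls and retry a couple of times on errors.
    return RateLimiter(geolocator.geocode, min_delay_seconds=1, max_retries=2)


def geocode_places(queries, geocode):
    """Map each query string to (latitude, longitude), or None if not found."""
    coords = {}
    for query in queries:
        location = geocode(query)
        if location is not None:
            coords[query] = (location.latitude, location.longitude)
        else:
            coords[query] = None
    return coords


# Offline demonstration with a stand-in geocoder (coordinates are dummy values).
def fake_geocode(query):
    if query == "Known Place":
        return SimpleNamespace(latitude=1.0, longitude=2.0)
    return None


demo = geocode_places(["Known Place", "Unknown Place"], fake_geocode)
print(demo)
```

With network access, `geocode_places(["Atlanta, Georgia"], make_nominatim_geocoder())` would perform the real lookup; places that cannot be resolved come back as `None` rather than raising, so they can be reviewed by hand.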

-----END OF DATA SECTION-----
